Quick initial exploration of the main characteristics of the German skills matrices. The dataset used here is ‘wz08.dta’, a 2-digit industry matrix.
Reading the data into a network object:
library(foreign)
library(igraph)
Attaching package: ‘igraph’
The following objects are masked from ‘package:stats’:
decompose, spectrum
The following object is masked from ‘package:base’:
union
library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:igraph’:
as_data_frame, groups, union
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
# read data
data_nace2 <- data.frame(read.dta("/Users/crangelsmith/PycharmProjects/KnowledgeFlows/data/MR_04-17_EN_data_GermanySkillsMatrix/wz08.dta"))
length(unique(c(data_nace2$wz08_1,data_nace2$wz08_2)))
[1] 597
hist(data_nace2$SRt,main="Edge weight SRt",breaks=50)
# select edges where the flow is higher than random
data_nace2_df <- data_nace2[data_nace2$SRt>0,]
Top 20 flows:
library(knitr)
top_nace2_df <- data_nace2_df[order(-data_nace2_df$SRt),]
kable(top_nace2_df[top_nace2_df$SRt!=1,][1:20,])
| | wz08_1 | wz08_2 | SRt |
|---|---|---|---|
| 22727 | Mining of chemical and fertiliser minerals | Extraction of salt | 0.9994 |
| 23919 | Extraction of salt | Mining of chemical and fertiliser minerals | 0.9993 |
| 94483 | Manufacture of plaster products for construction purposes | Manufacture of lime and plaster | 0.9981 |
| 247574 | Inland passenger water transport | Inland freight water transport | 0.9973 |
| 41794 | Manufacture of wine from grape | Growing of grapes | 0.9971 |
| 1862 | Growing of grapes | Manufacture of wine from grape | 0.9966 |
| 18027 | Mining of hard coal | Manufacture of coke oven products | 0.9966 |
| 352436 | Repair of watches, clocks and jewellery | Manufacture of watches and clocks | 0.9966 |
| 338466 | Operation of arts facilities | Performing arts | 0.9965 |
| 336678 | Performing arts | Operation of arts facilities | 0.9963 |
| 18551 | Mining of lignite | Support activities for other mining and quarrying | 0.9960 |
| 90296 | Manufacture of ceramic sanitary fixtures | Manufacture of ceramic tiles and flags | 0.9952 |
| 248170 | Inland freight water transport | Inland passenger water transport | 0.9952 |
| 352627 | Repair of watches, clocks and jewellery | Retail sale of watches and jewellery in specialised stores | 0.9951 |
| 246982 | Sea and coastal freight water transport | Service activities incidental to water transportation | 0.9948 |
| 167572 | Trade of gas through mains | Transport via pipeline | 0.9946 |
| 237003 | Retail sale of watches and jewellery in specialised stores | Repair of watches, clocks and jewellery | 0.9946 |
| 251154 | Service activities incidental to water transportation | Sea and coastal freight water transport | 0.9946 |
| 88508 | Manufacture of ceramic tiles and flags | Manufacture of ceramic sanitary fixtures | 0.9943 |
| 263720 | Motion picture, video and television programme production activities | Motion picture, video and television programme post-production activities | 0.9943 |
#build the graph
nace2_net <- graph_from_data_frame(top_nace2_df,directed = FALSE)
V(nace2_net) # The vertices of the "nace2_net" object
+ 597/597 vertices, named, from 0ee0a31:
[1] Growing of cereals (except rice), leguminous crops and oil seeds
[2] Growing of vegetables and melons, roots and tubers
[3] Growing of other non-perennial crops
[4] Growing of grapes
[5] Growing of pome fruits and stone fruits
[6] Growing of other tree and bush fruits and nuts
[7] Growing of beverage crops
[8] Growing of spices, aromatic, drug and pharmaceutical crops
[9] Growing of other perennial crops
[10] Plant propagation
+ ... omitted several vertices
E(nace2_net)$weight <-E(nace2_net)$SRt
Edge density: the proportion of edges present out of all possible edges in the network.
edge_density(nace2_net, loops=FALSE)
[1] 0.1566614
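To make the density figure easier to read, here is a minimal sketch on a hypothetical toy graph (not the wz08 data) that computes the same quantity by hand. For an undirected graph without self-loops, density is the edge count divided by n(n-1)/2.

```r
library(igraph)

# Hypothetical toy graph: 4 vertices joined in a chain (3 edges)
toy <- graph_from_literal(A-B, B-C, C-D)

n <- vcount(toy)  # number of vertices
m <- ecount(toy)  # number of edges

# Density by hand: observed edges over all possible undirected pairs
manual_density <- m / (n * (n - 1) / 2)

manual_density
edge_density(toy, loops = FALSE)  # should agree with the manual value
```

Since some vertices in nace2_net appear to be connected only to themselves, it may be worth checking for self-loops with which_loop() before interpreting the density, because the formula above assumes there are none.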
Transitivity (clustering): measures the probability that the adjacent nodes of a node are themselves connected. In other words, if i is connected to j, and j is connected to k, what is the probability that i is also connected to k?
transitivity(nace2_net, type="global") # net is treated as an undirected network
[1] 0.3600216
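As a rough illustration of what the global transitivity measures, the sketch below uses a hypothetical toy graph (a triangle with one pendant vertex) and computes the ratio of closed triples to all connected triples by hand.

```r
library(igraph)

# Hypothetical toy graph: triangle A-B-C plus a pendant vertex D attached to C
toy <- graph_from_literal(A-B, B-C, A-C, C-D)

# Each triangle is counted once per member vertex, hence the division by 3
n_triangles <- sum(count_triangles(toy)) / 3

# Connected triples: every pair of neighbours around each vertex
n_triples <- sum(choose(degree(toy), 2))

# Global transitivity: each triangle closes three triples
manual_transitivity <- 3 * n_triangles / n_triples

manual_transitivity
transitivity(toy, type = "global")  # should agree with the manual value
```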
Node degree: the number of edges adjacent to each node.
Strength is the weighted counterpart of degree: the sum of the weights (here SRt) of the edges adjacent to each node.
The call below passes the mode “out”, intended to capture job changes leaving each industry. Because the graph was built with directed = FALSE, igraph ignores the mode argument and the result is the same as the default.
sort(strength(nace2_net,mode="out"))[1:5]
Growing of beverage crops Mining of other non-ferrous metal ores
2 2
Manufacture of other non-distilled fermented beverages Manufacture of essential oils
2 2
Manufacture of non-electric domestic appliances
2
sort(-strength(nace2_net))[1:5]
Manufacture of other plastic products Machining
-107.3118 -100.1353
Manufacture of other parts and accessories for motor vehicles Manufacture of metal structures and parts of structures
-95.5193 -94.8034
Manufacture of other fabricated metal products n.e.c.
-94.6890
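The difference between degree and strength is easier to see on a small example. The sketch below uses hypothetical edges and weights (not the wz08 data) and includes one self-loop, which igraph counts twice in an undirected graph. That would explain a strength of 2 for a vertex whose only edge is a self-loop of weight 1.

```r
library(igraph)

# Hypothetical weighted edge list; C-C is a self-loop
edges <- data.frame(
  from = c("A", "A", "C"),
  to   = c("B", "C", "C"),
  SRt  = c(0.5, 0.25, 1)
)
toy <- graph_from_data_frame(edges, directed = FALSE)
E(toy)$weight <- E(toy)$SRt

deg <- degree(toy)    # edge counts; the loop adds 2 to C
str <- strength(toy)  # sums of edge weights; the loop adds 2 * 1 to C

deg
str

# For an undirected graph the mode argument makes no difference
str_out <- strength(toy, mode = "out")

# Descending order without negating the values
sort(str, decreasing = TRUE)
```

Using sort(..., decreasing = TRUE) gives the same ordering as sort(-strength(...)) while keeping the values positive.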
p1 <- hist(degree(nace2_net),breaks=50,plot=FALSE)
p2 <- hist(strength(nace2_net),breaks=30,plot=FALSE)
plot( p2, col=rgb(0,0,1,1/4), xlim=c(0,250),main ="Strength and degree") # strength (blue)
plot( p1, col=rgb(1,0,0,1/4), xlim=c(0,250), add=TRUE) # degree (red), overlaid
legend(200,50,legend=c("strength","degree"), col=c(rgb(0,0,1,1/4),rgb(1,0,0,1/4)), bty = "n",ncol=1, pch=c(15,15))
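The graph is disconnected: some vertices are connected only to themselves.
dg <- decompose(nace2_net) # returns a list with one graph per connected component
for (i in 2:length(dg)){
print(V(dg[[i]]))
# note: print() does not concatenate its arguments, so only the label is shown;
# paste() or cat() would be needed to display the weights
print("SRt weight: ",E(dg[[i]])$SRt)
}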
+ 1/1 vertex, named, from 940af05:
[1] Growing of beverage crops
[1] "SRt weight: "
+ 1/1 vertex, named, from 4ab8221:
[1] Mining of other non-ferrous metal ores
[1] "SRt weight: "
+ 1/1 vertex, named, from bfdef92:
[1] Manufacture of other non-distilled fermented beverages
[1] "SRt weight: "
+ 1/1 vertex, named, from e6d7992:
[1] Manufacture of essential oils
[1] "SRt weight: "
+ 1/1 vertex, named, from 08a1601:
[1] Manufacture of non-electric domestic appliances
[1] "SRt weight: "
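An alternative way to list the isolated vertices, sketched here on a hypothetical toy graph, is to use components() and pick out the members of size-one components. This avoids building a separate graph object for each component.

```r
library(igraph)

# Hypothetical edge list: a chain A-B-C plus D, which only has a self-loop
edges <- data.frame(
  from = c("A", "B", "D"),
  to   = c("B", "C", "D")
)
toy <- graph_from_data_frame(edges, directed = FALSE)

comp <- components(toy)
comp$no     # number of connected components
comp$csize  # size of each component

# Vertices whose component contains only themselves
isolated <- names(comp$membership)[comp$csize[comp$membership] == 1]
isolated
```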
print (ceb$modularity)
[1] 0.4815053 0.5037376
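The chunk that creates ceb is not shown here, so the algorithm behind these modularity values is unknown. As an illustration only, the sketch below runs one of igraph's community detection functions, cluster_fast_greedy(), on a hypothetical toy graph and inspects the resulting modularity and membership. The choice of algorithm is an assumption, not the method used above.

```r
library(igraph)

# Hypothetical toy graph: two triangles joined by a single bridge edge C-D
toy <- graph_from_literal(A-B, B-C, A-C, D-E, E-F, D-F, C-D)

# Greedy modularity optimisation (assumed here purely for illustration)
fc <- cluster_fast_greedy(toy)

n_communities <- length(fc)   # number of communities found
q <- modularity(fc)           # modularity of the chosen partition

n_communities
q
membership(fc)                # community id for each vertex
sizes(fc)                     # number of vertices per community

# Edges that run between communities rather than within them
n_crossing <- sum(crossing(fc, toy))
n_crossing
```

Modularity compares the share of edges inside communities with what would be expected if edges were placed at random given the degrees; values closer to 1 indicate a more clearly separated partition.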
print_cummunity_vertex(node_list,1, 0.2)
[1] "Community: 1"
[1] "Color: gray50"
[1] "Total number of vertices: 93"
[1] "% of vertices connected to other communities 0.107142857142857"
[1] "Wholesale of watches and jewellery"
[2] "Retail sale via stalls and markets of textiles, clothing and footwear"
[3] "Retail sale of games and toys in specialised stores"
[4] "Retail sale of footwear and leather goods in specialised stores"
[5] "Processing of nuclear fuel"
[6] "Manufacture of glues"
[7] "Wholesale of office furniture"
[8] "Manufacture of other chemical products n.e.c."
[9] "Manufacture of paper stationery"
[10] "Manufacture of paints, varnishes and similar coatings, printing ink and mastics"
[11] "Manufacture of refined petroleum products"
[12] "Manufacture of luggage, handbags and the like, saddlery and harness"
[13] "Retail sale via mail order houses or via Internet"
[14] "Repair of other personal and household goods"
[15] "Manufacture of other articles of paper and paperboard"
[16] "Manufacture of other organic basic chemicals"
[17] "Repair of watches, clocks and jewellery"
[18] "Manufacture of soap and detergents, cleaning and polishing preparations"
print_cummunity_vertex(node_list,2, 0.2)
[1] "Community: 2"
[1] "Color: tomato"
[1] "Total number of vertices: 97"
[1] "% of vertices connected to other communities 0.102272727272727"
[1] "Test drilling and boring"
[2] "Manufacture of other builders' carpentry and joinery"
[3] "Manufacture of ceramic sanitary fixtures"
[4] "Manufacture of sports goods"
[5] "Construction of water projects"
[6] "Collection of non-hazardous waste"
[7] "Support activities for other mining and quarrying"
[8] "Retail sale of hardware, paints and glass in specialised stores"
[9] "Plumbing, heat and air-conditioning installation"
[10] "Treatment and disposal of hazardous waste"
[11] "Wholesale of waste and scrap"
[12] "Agents involved in the sale of furniture, household goods, hardware and ironmongery"
[13] "Repair of other equipment"
[14] "Manufacture of office and shop furniture"
[15] "Distribution of electricity"
[16] "Manufacture of ceramic tiles and flags"
[17] "Manufacture of concrete products for construction purposes"
[18] "Manufacture of kitchen furniture"
[19] "Shaping and processing of flat glass"
print_cummunity_vertex(node_list,3, 0.3)
[1] "Community: 3"
[1] "Color: gold"
[1] "Total number of vertices: 96"
[1] "% of vertices connected to other communities 0.0909090909090909"
[1] "Manufacture of engines and turbines, except aircraft, vehicle and cycle engines"
[2] "Manufacture of electronic components"
[3] "Manufacture of fibre optic cables"
[4] "Wholesale of hardware, plumbing and heating equipment and supplies"
[5] "Manufacture of batteries and accumulators"
[6] "Machining"
[7] "Manufacture of other special-purpose machinery n.e.c."
[8] "Manufacture of plastic plates, sheets, tubes and profiles"
[9] "Manufacture of machinery for metallurgy"
[10] "Manufacture of electric lighting equipment"
[11] "Manufacture of fluid power equipment"
[12] "Manufacture of basic iron and steel and of ferro-alloys"
[13] "Manufacture of military fighting vehicles"
[14] "Other manufacturing n.e.c."
[15] "Lead, zinc and tin production"
[16] "Installation of industrial machinery and equipment"
[17] "Manufacture of tubes, pipes, hollow profiles and related fittings, of steel"
[18] "Aluminium production"
[19] "Wholesale of other machinery and equipment"
[20] "Manufacture of ovens, furnaces and furnace burners"
[21] "Cold forming or folding"
[22] "Casting of other non-ferrous metals"
[23] "Manufacture of lifting and handling equipment"
[24] "Manufacture of doors and windows of metal"
[25] "Manufacture of light metal packaging"
[26] "Manufacture of plastics and rubber machinery"
[27] "Manufacture of instruments and appliances for measuring, testing and navigation"
[28] "Manufacture of other pumps and compressors"
print_cummunity_vertex(node_list,4, 0.7)
[1] "Community: 4"
[1] "Color: yellowgreen"
[1] "Total number of vertices: 38"
[1] "% of vertices connected to other communities 0.151515151515152"
[1] "Repair and maintenance of ships and boats"
[2] "Renting and leasing of trucks"
[3] "Inland freight water transport"
[4] "Service activities incidental to land transportation"
[5] "Maintenance and repair of motor vehicles"
[6] "Wholesale trade of motor vehicle parts and accessories"
[7] "Tour operator activities"
[8] "Taxi operation"
[9] "Other passenger land transport n.e.c."
[10] "Sale of other motor vehicles"
[11] "Sale of cars and light motor vehicles"
[12] "Manufacture of rubber tyres and tubes; retreading and rebuilding of rubber tyres"
[13] "Wholesale of agricultural machinery, equipment and supplies"
[14] "Travel agency activities"
[15] "Other reservation service and related activities"
[16] "Passenger air transport"
[17] "Inland passenger water transport"
[18] "Sale, maintenance and repair of motorcycles and related parts and accessories"
[19] "Marine fishing"
[20] "Repair and maintenance of aircraft and spacecraft"
[21] "Driving school activities"
[22] "Retail trade of motor vehicle parts and accessories"
[23] "Service activities incidental to water transportation"
[24] "Renting and leasing of air transport equipment"
[25] "Repair and maintenance of other transport equipment"
[26] "Freight rail transport"
print_cummunity_vertex(node_list,5, 0.25)
[1] "Community: 5"
[1] "Color: blue"
[1] "Total number of vertices: 105"
[1] "% of vertices connected to other communities 0.117021276595745"
[1] "Renting and operating of own or leased real estate"
[2] "Manufacture of wallpaper"
[3] "Other telecommunications activities"
[4] "Other publishing activities"
[5] "Computer facilities management activities"
[6] "Public relations and communication activities"
[7] "News agency activities"
[8] "Business and other management consultancy activities"
[9] "Pension funding"
[10] "Trusts, funds and similar financial entities"
[11] "Manufacture of magnetic and optical media"
[12] "Publishing of computer games"
[13] "Wholesale of computers, computer peripheral equipment and software"
[14] "Other printing"
[15] "Retail sale of computers, peripheral units and software in specialised stores"
[16] "Buying and selling of own real estate"
[17] "Manufacture of communication equipment"
[18] "Web portals"
[19] "Pre-press and pre-media services"
[20] "Television programming and broadcasting activities"
[21] "Repair of consumer electronics"
[22] "Book publishing"
[23] "Other amusement and recreation activities"
[24] "Fund management activities"
[25] "Motion picture, video and television programme distribution activities"
[26] "Operation of arts facilities"
print_cummunity_vertex(node_list,6, 0.5)
[1] "Community: 6"
[1] "Color: orange"
[1] "Total number of vertices: 48"
[1] "% of vertices connected to other communities 0.116279069767442"
[1] "Other education n.e.c."
[2] "Hospital activities"
[3] "Operation of historical sites and buildings and similar visitor attractions"
[4] "Other social work activities without accommodation n.e.c."
[5] "Specialist medical practice activities"
[6] "Activities of political organisations"
[7] "Justice and judicial activities"
[8] "Activities of professional membership organisations"
[9] "Activities of other membership organisations n.e.c."
[10] "Residential care activities for mental retardation, mental health and substance abuse"
[11] "General secondary education"
[12] "Foreign affairs"
[13] "Operation of sports facilities"
[14] "Museums activities"
[15] "General public administration activities"
[16] "Public order and safety activities"
[17] "Tertiary education"
[18] "Other residential care activities"
[19] "Defence activities"
[20] "Other sports activities"
[21] "Child day-care activities"
[22] "Activities of business and employers membership organisations"
[23] "Fitness facilities"
[24] "Technical and vocational secondary education"
print_cummunity_vertex(node_list,7, 0.2)
[1] "Community: 7"
[1] "Color: black"
[1] "Total number of vertices: 115"
[1] "% of vertices connected to other communities 0.0952380952380952"
[1] "Retail sale via stalls and markets of other goods"
[2] "Retail sale of tobacco products in specialised stores"
[3] "Event catering activities"
[4] "Raising of other cattle and buffaloes"
[5] "Warehousing and storage"
[6] "Production of meat and poultry meat products"
[7] "Botanical and zoological gardens and nature reserves activities"
[8] "Processing of tea and coffee"
[9] "Wholesale of fruit and vegetables"
[10] "Freshwater fishing"
[11] "Manufacture of prepared meals and dishes"
[12] "Manufacture of margarine and similar edible fats"
[13] "Manufacture of macaroni, noodles, couscous and similar farinaceous products"
[14] "Manufacture of condiments and seasonings"
[15] "Wholesale of coffee, tea, cocoa and spices"
[16] "Marine aquaculture"
[17] "Holiday and other short-stay accommodation"
[18] "Camping grounds, recreational vehicle parks and trailer parks"
[19] "Washing and (dry-)cleaning of textile and fur products"
[20] "Wholesale of dairy products, eggs and edible oils and fats"
[21] "Processing and preserving of potatoes"
[22] "Support services to forestry"
[23] "Manufacture of sugar"